Data Transformation for Improved Query Performance
Abstract
A database management system stores data so that it can be retrieved quickly and efficiently later. Typical database queries fall into three broad classes: range queries, k-nearest neighbor (k-NN) queries, and box queries. Implementing a box query typically involves simple comparisons between the query objects and index objects, whereas implementing range queries and k-NN queries can be more involved. In this dissertation, we study the mapping of one type of query onto another. From a performance perspective, an index structure may favor one type of query over another, so such a mapping provides a way of improving query performance. It also highlights relationships among the various query classes. Our first transformation maps a range query in L1 space onto a box query. This mapping provides a similar interface between the query space and the data space of each index page for bounding-box-based indexes. In 2-dimensional space, the mapping is exact, with no false positives or false negatives. However, it cannot be used directly in higher-dimensional spaces. We propose a novel approach called disjoint planar rotation to alleviate the problems in higher dimensions. We also develop a new type of box query (called a pruning box query) that is equivalent to the range query in the original space. Our theoretical analysis shows that this mapping can improve the I/O performance of the queries, and that the improvement grows with the number of dimensions. Experiments with some well-known indexing schemes verify these findings. Because of the underlying similarity in implementation, k-nearest neighbor queries can also be optimized using a similar transformation, and we successfully apply it to improve the I/O performance of k-NN queries as well. We next use a similar transformation to map box queries onto range queries in L1 space. However, the inherent ability of box queries to allow a varying degree of selectivity along each dimension poses some challenges for the transformation. We propose a square tiling approach that maps each box query onto a number of square box queries. Each of the square box queries can then be transformed into a range query. We demonstrate practical
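The abstract does not spell out the 2-dimensional mapping. As an illustration only, the sketch below assumes the standard 45-degree rotation u = x + y, v = x - y, under which |x| + |y| = max(|u|, |v|), so an L1 range query of radius r becomes an axis-aligned square box of half-width r in (u, v) space. The function names (`to_rotated`, `range_to_box`, `check_equivalence`) are hypothetical and are not taken from the dissertation; the higher-dimensional techniques (disjoint planar rotation, pruning box queries) are not shown.

```python
import random


def to_rotated(p):
    """Map a 2-D point (x, y) to (u, v) = (x + y, x - y). Assumed transform."""
    x, y = p
    return (x + y, x - y)


def in_l1_range(p, center, r):
    """True if p lies within L1 (Manhattan) distance r of center."""
    return abs(p[0] - center[0]) + abs(p[1] - center[1]) <= r


def range_to_box(center, r):
    """Turn an L1 range query into a box query in rotated (u, v) space."""
    cu, cv = to_rotated(center)
    return ((cu - r, cu + r), (cv - r, cv + r))


def in_box(q, box):
    """True if the rotated point q lies inside the axis-aligned box."""
    (u_lo, u_hi), (v_lo, v_hi) = box
    return u_lo <= q[0] <= u_hi and v_lo <= q[1] <= v_hi


def check_equivalence(trials=1000, seed=0):
    """Compare both query forms on random integer points (exact arithmetic)."""
    rng = random.Random(seed)
    center, r = (3, -2), 5
    box = range_to_box(center, r)
    for _ in range(trials):
        p = (rng.randint(-20, 20), rng.randint(-20, 20))
        if in_l1_range(p, center, r) != in_box(to_rotated(p), box):
            return False
    return True
```

Under this assumed transform the two membership tests agree on every point in 2-D, which matches the abstract's statement that the 2-dimensional mapping has no false positives or false negatives. Read in the other direction, a square box in (u, v) space corresponds to an L1 range query, which is why a general box would first have to be covered by square boxes.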
Similar resources
Private Key based query on encrypted data
Nowadays, users of information systems are inclined to use a central server to reduce data transfer and maintenance costs. Since such a system is not fully trustworthy, users' data is usually kept encrypted. However, encryption is not a cure-all for security problems and cannot guarantee data security. In other words, there are some techniques that can endanger security of encrypted d...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Discovering Popular Clicks' Pattern of Teen Users for Query Recommendation
Search engines are still the most important gateways for information search on the internet. In this regard, providing the best response to the user's request in the shortest possible time is still desired. Normally, search engines are designed for adults, and few policies have been employed with teen users in mind. Teen users are more biased in clicking the results list than are adult users. This leads...
An Effective Path-aware Approach for Keyword Search over Data Graphs
Keyword search is known as a user-friendly alternative to structured languages for retrieving information from graph-structured data. Efficient retrieval of relevant answers to a keyword query and effective ranking of these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...